LEONARD KLEINROCK INTERVIEW 
There's a key idea out of information theory which I started to talk to you about 
which says that if you take -- whereas individual users or data streams may behave 
very unpredictably, a collection of a large number of them behaves very predictably. 
And one understands that with a bursty stream, such as a data stream, you don't 
want to have a circuit switch where you keep the connection up all the time. You 
want to be able to let go of the connection or the resources when you're not using 
them, because you tend to use them in bursts. But the other half of that story is the 
result I just told you, is (if) you have a large number of them, they behave in a very 
predictable way. So I can tell you just how many people I can support on one 
communications link if you tell me the bandwidth of the link and the bandwidth 
requirements of each user, even though each user's unpredictable. And it's 
something you can calculate as well. Without being condescending, if I told you I 
had 100 users each using 5% of a communications line on average, highly bursty, 
okay? So you'll (lease) the line 5% of the time. And if you have 100 users, how 
many lines do you need? The answer is you need five lines, because each user uses 
essentially 1/20th of the line. 
I see. 
Now that's a calculation without allowing for statistics. And the point is the 
unpredictability goes away when you gang up a bunch of users independently. I 
didn't say that very well. Another way to say it is this. Let's get to gambling. If 
you toss a coin, say half the times it's heads, half the times tails. You toss it five
times and you get say four heads and one tails. You're not surprised. Even though 
you expected like well, say six times you tossed it. You split the three heads, three 
tails. Right? Make it five heads and one tail. If you toss it a million times, you'd 
better get almost exactly a half a million heads and a half a million tails, and if you 
got like 480,000 heads and 520,000 tails, that coin is not fair. Guaranteed. The 
likelihood of that being fair is one in billion, billion, billion. You know, I can give 
you the numbers if you like. And the way to characterize that is to say that if you 
let nature take a good crack at you, she's going to expose her average behavior to 
you. 
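The two numerical claims above, the 100-user link and the million coin tosses, can be checked with a short sketch. It assumes a simple model that the interview does not spell out: each user is independently active with probability 0.05 at any instant, so the number of simultaneously active users is binomial. The function and variable names are illustrative only.

```python
from math import comb, sqrt


def prob_more_than(lines: int, users: int, p_active: float) -> float:
    """Probability that more than `lines` of `users` independent users are active at once."""
    return sum(
        comb(users, k) * p_active**k * (1 - p_active) ** (users - k)
        for k in range(lines + 1, users + 1)
    )


# 100 bursty users, each busy 5% of the time (assumed independent).
users, p_active = 100, 0.05
mean_lines = users * p_active  # the "no statistics" answer: five lines
overflow_with_5_lines = prob_more_than(5, users, p_active)
overflow_with_10_lines = prob_more_than(10, users, p_active)

# Coin example: how far is 480,000 heads from the expected 500,000,
# measured in standard deviations of a fair coin?
tosses, heads = 1_000_000, 480_000
sigma = sqrt(tosses * 0.5 * 0.5)
deviations = (tosses / 2 - heads) / sigma

print(f"average demand: {mean_lines:.1f} lines")
print(f"P(demand > 5 lines)  = {overflow_with_5_lines:.3f}")
print(f"P(demand > 10 lines) = {overflow_with_10_lines:.4f}")
print(f"480,000 heads is {deviations:.0f} standard deviations from fair")
```

Under this model, five lines cover only the average demand, and a modest number of extra lines makes overflow rare. That is the sense in which a large group of unpredictable users becomes predictable.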
Sounds like something Shannon would say. 
In fact, that's almost Shannon's words. I took that notion and applied it to the 
world of packet switching -- of data switching -- and said, "Look, if we got these --." 
I'm sorry. But where are we chronologically now?
Chronologically is when I started working on my dissertation. 
And that would be what year? 
I submitted the proposal sometime in 1959. I forget when. I think the fall. You 
know, I came in the fall of '58 and worked for about a year in the Chess 
Program and then I went into networking stuff. And so I began to investigate the 
behavior of message switch networks. And I started to tell you the difference 
between message switching and packet switching. In message switching, you take
the entire message and you transmit it as a bulk, hop, hop, hop. And the trouble 
there is if the message is large, each time you transmit it over a communications line, 
at the other end, you've got to wait until it all comes in before you start relaying it 
over the next hop. If the message is very long, it's going to take a long time for each 
hop and many hops and you've got a long delay. And that's the way the teletype 
network was working and a variety of others in those days. I analyzed the behavior 
and was able to show that the response time of these things was dependent upon the 
length of the message, and if you could use shorter messages, you'd have a much better
response time. Shorter messages means break them up into little pieces called
packets. We didn't call it packet switching. I was able to show through analysis the
effect of the size of the block you were transmitting and what it meant to the overall 
delay. And in particular, nobody had been considering queuing effects. Everybody 
typically assumed there was nothing in your way when you went hop, hop. But each 
node you got to wait as well. And if you got to wait for other long guys, it magnifies 
the effect of the size of the message. 
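The hop-by-hop argument above can be made concrete with a small sketch. It rests on simplifying assumptions that are not in the interview: every link has the same speed, packets are equal in size with no header overhead, and propagation and queuing delays are ignored. The function names and the example numbers are illustrative.

```python
def message_switching_delay(message_bits: int, hops: int, link_bps: float) -> float:
    """Whole message is received in full at each node before being relayed."""
    return hops * message_bits / link_bps


def packet_switching_delay(
    message_bits: int, packet_bits: int, hops: int, link_bps: float
) -> float:
    """Packets are relayed as soon as each one arrives, so the hops overlap."""
    packets = -(-message_bits // packet_bits)  # ceiling division
    # The first packet needs `hops` transmissions; each remaining packet
    # arrives one packet-time behind the one before it.
    return (packets + hops - 1) * packet_bits / link_bps


message_bits, hops, link_bps = 1_000_000, 5, 50_000.0
whole = message_switching_delay(message_bits, hops, link_bps)
split = packet_switching_delay(message_bits, 1_000, hops, link_bps)

print(f"message switching: {whole:.2f} s")
print(f"packet switching:  {split:.2f} s")
```

With queuing added at each node, as the interview notes, waiting behind other long messages widens the gap further. This sketch leaves that effect out.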
Sure. 
Okay. So that was built in and all of that came out in the beginnings of the theory 
I was developing. So what I did is I analyzed -- 
I'm having trouble making the connection. How did you go from this whole idea of, you 
know, average behavior -- 
Oh, how did that get into the picture? 
-- into data switching. It isn't making sense to me. 
Oh, hit the point. You got a communications line. That's the good of bursty stream. 
structure of the network that's supporting that kind of traffic? And so I analyzed
a number of things. I started out -- first of all, one had to make a model of this. 
And I developed an analytic model that very well captured the behavior of the real 
network. There were some very important things I had to do. Mathematically you 
can't solve these networks. Never mind design. But if I give you a network, you 
can't analyze it. They're just too hard. For reasons I can explain. You probably 
don't want to know. But there were some essential problems. So I had to introduce 
some assumptions and some approximations which were exactly the right 
approximations to make the model yield to analysis and still represent the real world 
very accurately. The key assumption is something called the independence 
assumption. Do you care about it? 
No. 
All right. So the point is I developed a workable analytic model for the performance 
of data networks. I then went into some design issues. Decided how you should lay 
out a network, how you should decide how fat the line should be, and how you 
should move the traffic through the network -- that's called routing. And I 
developed and invented some algorithms. Analyzed them and optimized them. The 
optimization, of course, was the design step. Now by the way, in that same work -- 
you understand nobody was interested in this stuff at that time. Nobody. This was
a piece of work that I was doing because I was interested in it. I had to invent some 
of my own tools. I never studied queuing theory. I had to develop some tools of 
queuing theory and take what was known, because this was a problem in delay which 
is what queuing theory is all about. In fact, nobody at that time had used queuing 
theory to evaluate computer systems at all. And so a second component of my work,
besides that in networks, was in the other really exciting development of that time, 
which was timesharing. Timesharing was just happening in the very early '60's 
when I was into this. Okay? '59 to '62. 
Uh-huh. 
In fact, one of the very first timesharing systems developed was at MIT. 
Right. Yes, I know. 
And so I quickly took that and I developed a model for it. The first analytic papers 
on timeshared modeling was in my thesis as well, and that was a whole other 
development that was basically developed in the 60's, which you probably don't care 
about. 
Well, I care about it in so far as timesharing's effect on networking. The relationship
between timesharing and networking. 
Okay. Well, it's a good point. It's one of the reasons we had to develop a network 
in Arpa, and I'll explain that in a minute. Anyway, you have to understand the 
common element here was queuing theory. Queuing theory asks the following 
questions. What's the (proof) of the system? By the way, I'm going to couch these 
questions using terminology that makes sense to networking, as opposed to makes 
sense to somebody studying  of the processes. The mathematician doesn't use 
these words. Mathematician talks about other things. . Capacity. How 
much can you support? Response time. Buffer size. All the systems elements that 
you need to concern yourself with when you're building something like a network or 
like a timesharing system. How many users can you support? How long will it take 
me to get a response? How long will it take the message to get through the network? 
How fat must be the channel here? How much buffer storage do you need in the 
switch? So these are the things I was concerned about and the obvious analytical 
tool was queuing theory. 
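As a rough illustration of how queuing theory answers these questions, here is a sketch using the simplest textbook model, a single M/M/1 queue (Poisson arrivals, exponential service times, one server). The interview does not say which model was applied to which question, so the choice of M/M/1 and all the names and numbers below are assumptions made for illustration.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Fraction of time the channel is busy (rho)."""
    return arrival_rate / service_rate


def mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Average time in system, waiting plus transmission, for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def mean_number_in_system(arrival_rate: float, service_rate: float) -> float:
    """Average number of messages queued or in transmission."""
    rho = utilization(arrival_rate, service_rate)
    return rho / (1.0 - rho)


def buffer_for_overflow(arrival_rate: float, service_rate: float, epsilon: float) -> int:
    """Smallest k with P(more than k in system) <= epsilon, using the
    infinite-buffer tail rho**(k + 1) as an approximation."""
    rho = utilization(arrival_rate, service_rate)
    k = 0
    while rho ** (k + 1) > epsilon:
        k += 1
    return k


arrival_rate, service_rate = 8.0, 10.0  # messages per second (made-up values)
rho = utilization(arrival_rate, service_rate)
response = mean_response_time(arrival_rate, service_rate)
in_system = mean_number_in_system(arrival_rate, service_rate)
buffers = buffer_for_overflow(arrival_rate, service_rate, 0.01)

print(f"utilization {rho:.2f}, response time {response:.2f} s")
print(f"mean number in system {in_system:.2f}, buffers for 1% overflow: {buffers}")
```

The same three quantities, load, delay, and storage, are the capacity, response time, and buffer size named in the answer above.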
A second tool, by the way, was called network flow theory. And that's, by the way, 
what Howie Frank was involved with, and we'll get to that in a minute. But you can 
address some of the network flow problems through queuing theory and I'll tell you
what the overlap was later. Now in addition to developing these tools and these 
fundamental principles of data networking, I did this monster simulation. There's 
where the couple at Lincoln Lab was on . They had this wonderful machine 
called TX-2, which I could get my hands on, and you know, the summers I was there, 
and I was there with Larry and those other guys, and Wes Clark, and all the guys. 
You probably didn't hear names like Froggy and Stocky Brand and Peterson. All 
the designers of the TX-2. 
Describe the TX-2 to me. 
Ah, [smacks lips]! It was a jewel. You have to understand. It was a machine in 
development while we were using it. It was the second transistorized machine built 
at Lincoln Lab. It was well ahead of its day. It had instructions that were not used 
in the large machines for like a decade or two later. With one instruction, I could
-- first of all, it had a 36-bit word, which is large, and you could break it into four
sections of nine and do four operations independently on these nine-bit segments, or 
you could break it into one 18 and two nines or two 18's, as you like. With one 
instruction I could test a bit to see if it was a one or a zero, and based on that test, 
I could either change it or not, set it to a one or not, and rotate the whole word
around one bit. So if I was looking at say the least significant position -- do you
know what I'm talking about? Here's 36 bits lined up. Like 36 integers, the least 
significant integer is the one's position. Well, similarly in the digital world. If you 
look at that bit, you do all the examination and testing of it. When you're done with
that, all on one instruction, you could rotate the word around so the second digit 
appeared in the first position, and you keep testing it. So it's a very efficient 
machine. Anyways, it was very advanced. Let me just put it that way. It was built 
by a collection of hardware designers and logic designers, who themselves were the 
programmers. You see, this was a machine where you were not a specialist, but you 
did it all, and these guys were super good. 
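The word operations described above can be mimicked in a few lines. This is only a sketch of the idea, splitting a 36-bit word into nine-bit pieces and walking through it by testing the low bit and rotating. It is not TX-2 code and does not model the machine's actual instruction set; the function names are invented for illustration.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def rotate_right(word: int, n: int = 1) -> int:
    """Rotate a 36-bit word right by n bits; bits falling off the low end re-enter at the top."""
    n %= WORD_BITS
    return ((word >> n) | (word << (WORD_BITS - n))) & WORD_MASK


def split_quarters(word: int) -> list[int]:
    """Break a 36-bit word into four nine-bit fields, most significant first."""
    return [(word >> shift) & 0o777 for shift in (27, 18, 9, 0)]


def count_ones(word: int) -> int:
    """Test the least significant bit, then rotate so the next bit takes its place."""
    count = 0
    for _ in range(WORD_BITS):
        count += word & 1          # examine the bit in the ones position
        word = rotate_right(word)  # bring the next bit into that position
    return count


print(split_quarters(0o123456701234))
print(count_ones(0b1011))
```

On the machine described, the test, the conditional set, and the rotate happened in one instruction. Here each is a separate step.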
And Larry was involved with it? 
He was not one of the designers. He came in later after the machine was built. But 
he wrote the second (compiler). It was called -- 
The story you were telling last night about how he came in and wrote. 
Yes. It was amazing what he did. I mean. I'm trying to remember the name of the 
compiler he wrote. I want to say Meta, but that wasn't his. I'll think of it. I still 
have my program, by the way. But the first program I ever wrote for that machine; 
you know, I was a neophyte programmer. The way you wrote programs for that 
machine, you'd have this big console there. Okay? Big room with this machine. 
People working on different pieces of it. And you'd sign up for hours at a time or 
half an hour at a time. You sit down there and there was a bank of little toggle 
switches and you'd set the ones and zeros in the toggle switches, which was your 
program or possibly you'd put it on a paper tape. This is no Fortran. This is called 
machine language with a little bit of assembly capability. Anyway, so I sat down and 
tried to run a program and it wouldn't run. You know, these things just stop 
immediately and Peterson came by. I forget his first name. He was one of these 
guys who had been designers. He was an expert programmer. And he watched me 
and he said, "Let me help you." He said, "Program's fine. Just needs a little bit of 
tuning." So he tuned a little bit and choo! 
Really? 
I mean, it was like artistry. You know, these guys they really understood. 
And you hadn't done much programming? 
I hadn't done any programming before that.
So it was a whole new world. 
Oh, it was unbelievable. I was suddenly thrust in with these guys at Lincoln. These 
were professional guys. They knew how to build paper tapes. Punch paper 
machines. The first of Xerox's Xerox machines, I believe, was there. It was a huge 
machine with a big drum and it spit out paper that wide and it was a continuous 
roll, and we had these big scissors, and when you turned that printer on, you had
better go over there and start cutting. If you fell behind, it was like the Sorcerer's 
Apprentice. It would be all over the place. 
So what was it like to get introduced to computers?
You never got introduced formally. You just were thrown in this group of people 
who were using computers and you get some documentation about the logic design, 
about the instruction set, about the functional description of the machine, and you'd 
be around these guys and you'd ask questions, and you'd look at some literature and 
you'd watch them and you'd start doing things. I had a job. You know, the first 
guy I worked with at Lincoln Lab was Ken Olsen. 
Oh, right. That's in your 
You were building some circuit boards? 
Well, I did a few things. First thing he wanted me to do was to study the behavior 
of gamma rays on flipflops. A flipflop is a basic element of a register. So we started 
doing that. And then he had me design something called a variable pulse delay unit. 
You hit the clock here and some fixed amount of time later, you want a pulse to 
come out, and that's variable. He wanted me to build it. I did. And he kept having 
me improve it here and there. He was very gentle, you know. He says, "That's good. 
But how about trying to do this." 
Now, I thought he was in Wes's group. 
You know, I don't know what the management structure was then, because it was 
my first summer there, '57, and at the end of the summer, he left. Yes, he was in 
the TX-2 group, but I don't know who was in charge at that time, to be honest with 
you. 
And that was Wes. Wes was at TX-2. 
Yeah. Yeah. Wes was I think one of the . He was the manager of TX-2
development. At any rate, then Ken left, and I did not go with him, of course, and 
I'm happy about that. But he took a lot of the good designers with him, by the way. 
Yeah, I've heard that. 
Anyway, where was I? What was the environment like? I got to my simulation. 
That's where we entered the TX-2 thing. That machine was available to me, because 
I was an employee and I was able to get that machine from midnight to 7:00AM, a 
huge shift, four days a week for a period of three months. This was toward the end 
of my dissertation. In order to debug my (Mansa) simulation. Mansa was 2500 lines 
of code. Trouble is those four days were not contiguous days. So it would be like 
Monday, Tuesday, Thursday, Saturday, or something. So you can imagine what 
happened with my sleep habits. You probably remember on the video clip the story 
about Larry Roberts and the way he scared the hell out of me one night? 
All right. Tell me that story again. You were a little dramatic with it. Tell it to me just 
in plainer -- 
Well, what happened was, you know, I'd be there. You know, I'd arrive. I'd be 
working all day pretty much and midnight came and I'd take the machine over.
And you know, people were gone by then so I was all by myself with this wonderful 
machine with all these parts which could break down and, you know, it was a million 
dollar machine, which to a kid like me was very important. And there were always 
pieces missing that were under repair. The registers -- the panels on the console 
were about six inches and about an inch and a quarter high, and these were the 
kinds of modules that plugged in, and there were little LED's that would light up 
and (all you might expect). And you'd get to know the machine like your body, you 
know. In fact, they had an audio -- a speaker on one of the registers in the CPU, 
and when it went up and down, it would make a noise, and when you set it running 
on the program, it would go [makes noises], and that was good. It was doing things. 
And if it ever went [whistles] it meant it was in a loop and you've been had. So you 
got to know. You heard what the paper tape reader sounded like and there was air, 
a cool fan here, and you knew what that was, and if anything was wrong, you'd hear 
it immediately and it would scare the hell out of you, because you don't want 
something to break, because that's an indication maybe something's crashing. So one 
night I'm there at about three in the morning and you really get tired and grumpy 
and you go into a trance, because half the time you're waiting for the thing to 
compile, so you're doing nothing. And I heard this funny sound. This s-s-s-s-s sound 
and right away my adrenalin started going what is wrong here? And it is dramatic. 
And I looked around and in one of the empty spaces -- one of these little modules
was missing -- I saw this pair of eyes looking out. Larry had snuck into the room 
and tried to scare me and he succeeded. I could have killed him! 
That's wonderful. So was that like Larry to play a joke on you? 
Not too much. That was especially amusing. He and I were good buddies. We did 
a lot of weird things together. Lots. I can't tell you all of them. 
Really? Well, you've told me about the Colorado River. I guess that was weird. 
Yeah, but no. We did some really strange things. 
Just things young people do? 
Yeah, you know, mischievous things. 
Oh, mischievous. 
Oh, yeah. 
Anything almost innocuous you could tell me about?
No. The gambling (in Silva) are exciting enough. At any rate, typically on those 
nights, come 7:00AM, one of my classmates would come in. I don't remember who 
it was exactly. See my thesis supervisor graduated five students. First was Erwin 
Jacobs, Ed Hoffstetter, Jack Rosenreid, then myself, and Hersch Lumas. They came 
in two, one, two, and after that he left. Now Hersch Lumas was with me and he 
would use the TX-2 as well. He'd come in at 7:00AM typically bright-eyed, bushy 
tailed, well-shaven, smelling good, and I'd be, you know, drinking coffee and bearded 
and dirty, and I'd look at him and say, "Get out of here!" You know, do the 
contrast and then I'd go home. And I'd always try to get home before the sun came 
up, because if the sun came up, I knew I had lost the night and I just wanted to get 
into bed before the sun rose, and I often didn't make it. Once I went home in a 
snow storm, just freshly snowed, and I really wanted to get home. Had to climb a 
hill to get home and the car was slipping, , and I just kept -- these were 
new tires -- I kept gunning and gunning, and I ruined a full set of new tires getting 
up the hill. 
And you were a poor graduate student. 
Yeah. Yeah. New means I bought them in a junkyard.
Right. Okay, we're getting off a little bit. So you were -- 
So I did this big simulation. The one thing that was kind of a semi-heroic thing, 
when I wrote my program in simulation, I wrote the full simulation before I 
debugged any of it. I just chose to do it that way. That's a very bad programming 
practice in general, but I like to take these challenges on. And if that simulation 
program didn't work, I was in danger of not getting a PhD. 
Really? 
Well, I had to prove my approximations were good. I mean, you know, you can 
assume anything. You had to prove them. And I finally did get it working, 
fortunately, in that massive effort over this period of three or four months, and it 
proved out my theory and that was a big success. But the TX-2 -- without that 
machine, I wouldn't have been able to take on that ambitious kind of simulation. It 
was highly interactive. I had things you don't even have now on machines. We had 
light buttons on the machine. It was a little pointer and you could point it with a 
light pen. Light pen was a photoelectric device connected to the computer and if you
point it to -- for example, suppose it had this little thing as a mosaic. If I point to 
it, then the pen only sees that light when the screen activates it. So the pen knows 
where that is because it knows when it saw that and we have and timed 
position. So if I hit that button, it means the one thing and this button means 
another. It was a highly interactive program. You could stop it. You could see the 
histograms growing as the data came in. You could see the traffic moving. And you 
got the analytic results as well -- the numeric results. Anyway, I still have that 
program and, in fact, I have a piece of paper which indicates when they said, "The 
TX-2 is being decommissioned." Which meant that program would never again run 
in any machine. 
Really? Tell me about the TX-2 SDC experiment. Do you know -- 
That came later. 
That was in what? '62? 
I believe it was '62 or '63. 
Do you know much about it? 
No. Just I didn't see it happen. I just know how Larry reported it.
So I should really ask Larry more about that. 
Yes. That for sure. Not me. But let's go to the timesharing thing. Just go off on 
that tangent for a minute. Timesharing came into its heyday in the early '60's for 
a very good reason, by the way. The early use of computers, or the way I describe 
it, you got to sit down (on council) and get free use of the machine. Not free. No. 
Full use. But then when the IBM world came on, they went to batch processing, and 
now you would take two or three days to do the smallest little thing to discover one 
error in your code, and so the turnaround time was awful. So timesharing was 
invented to solve that problem. Where you can get rapid response because more 
than one user could be using the machine at the same time. So these timesharing 
machines became popular and then Arpa, of course, began to support Computer 
Science research. In the early 60's, one of the main things they supported was 
timesharing. I was doing a lot of analytical work then deriving new ways to use 
timesharing systems. I invented something called processor sharing and I analyzed 
round robin. I invented a whole bunch of algorithms. 
Now Licklider had a lot to do with timesharing.
Licklider. Yeah. Timesharing. They set up the Arpa office, of course. But he
didn't do any analysis. Okay? He was just supporting that kind of work. 
Right. 
As a result, I told you about the PDP-10 problem, okay? Well, before the network 
was there, these machines were being deployed out of the network. So you're a 
researcher at the University of Utah and Arpa comes and agrees to support your 
research. You're going to say, "Listen, buy me a computer so I can do research." 
"Buy you a computer. Fine." So a lot of these timesharing systems were sprinkled 
around the research community, and every time a new researcher came in, they 
wanted to get their own machine. Trouble is each of the machines that they put out 
there began to develop in a unique fashion. Once you give it to a bunch of smart 
guys, they're going to do things with it. Suddenly, there's a unique resource at each 
of these facilities. So how do you make all of that resource available? You can't 
duplicate it everywhere. So the notion of needing a network came on. That was one 
of the prime motivations of the network. 
Well, this is what Bob Taylor talks about. 
It served a military use. Okay? But in order to support his own research
community he wanted to make the ILLIAC IV and the simulation work at UCLA and the
graphics work at Utah available to everybody. 
This is Taylor? 
Taylor and Larry. So resource sharing is quoted correctly to be the prime mover
for generating the network. 
Sure. 
Larry basically used that argument to justify the economics of the network. He said 
if we had to replicate this -- 
Yeah, I know. 
You know. Now that's an example of Larry gilding a lily. Nobody is going to put 
an ILLIAC everywhere. But he was effective. He would go to Congress and the Budget
Committee and say, "Listen, it will cost you "X" million to do this and a fraction of 
that with the network, and the network's only costing you so much. It's a good deal." 
Larry will exaggerate and to good effect, you know. Okay. So now where are we?
I'm finishing up my dissertation. It's very well received at MIT. In fact, it's so good 
they decided to make it into that McGraw-Hill book series from Lincoln Lab. The 
book that you've got a copy of. Only the best works have made it to the book and 
I was very flattered when they said they wanted to do that. I was surprised at that. 
Because again, no one was really interested in this area. So it was on the analytic 
and technical strength that it did that. Not on the application. I remember Peter
Elias was on the committee where I had to present this final result. I don't know if
you know him. He's one of the great information theorists, and Shannon, and 
(Arthurs), and a queuing theory type from the management school again, Galloho. 
What was his first name? I want to say Ed, but anyway. 
Did you consider yourself a queuing theorist at the time? 
No. Engineer. Engineer who was using queuing theory. As it turns out, in my 
thesis, I developed some brand new stuff in queuing theory. Something called a 
conservation law, which is still being generalized to this day. So I did some 
fundamental work with queuing theory, but I didn't do it for the purpose of queuing 
theory. I did it because I wanted to develop these engineering systems. 
Is the conservation law relevant to anything?
Yes. Certainly to timesharing and, in fact, to networking these days. Here's what 
it is. Ready? 
Uh-huh. 
There's a thing called priority queuing where some groups have to be treated better 
than others. Generals versus privates. Okay? Short messages versus long messages. 
So you have a whole bunch of priority groups. Now it stands to reason that if you 
help one group, which means you give them preferential treatment, you're going to 
hurt another group. The question is what's that balance? If I reduce your waiting 
time by two seconds, do I increase his waiting time by two seconds? And I 
characterize exactly what goes on there. 
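The trade-off described above can be sketched numerically. The interview does not state the formula, so what follows is the form of the conservation law usually given in queueing textbooks for an M/G/1 queue with non-preemptive, work-conserving scheduling: the sum over classes of (class load times class mean wait) is the same for every such discipline. The traffic numbers and function names are made up for illustration.

```python
# Each class: (arrival rate, mean service time, second moment of service time).
SHORT = (2.0, 0.1, 0.02)  # short messages, exponential service assumed
LONG = (0.5, 1.0, 2.0)    # long messages, exponential service assumed


def residual_work(classes):
    """Mean remaining service of the job found in service, W0."""
    return sum(lam * x2 for lam, _, x2 in classes) / 2


def fcfs_waits(classes):
    """First come, first served: every class sees the same mean wait."""
    rho = sum(lam * x for lam, x, _ in classes)
    wait = residual_work(classes) / (1 - rho)
    return [wait for _ in classes]


def priority_waits(classes):
    """Non-preemptive priority; classes[0] is served first."""
    w0 = residual_work(classes)
    waits, load_above = [], 0.0
    for lam, x, _ in classes:
        load_through = load_above + lam * x
        waits.append(w0 / ((1 - load_above) * (1 - load_through)))
        load_above = load_through
    return waits


def weighted_wait(classes, waits):
    """The conserved quantity: sum of (class load * class mean wait)."""
    return sum(lam * x * w for (lam, x, _), w in zip(classes, waits))


fcfs_total = weighted_wait([SHORT, LONG], fcfs_waits([SHORT, LONG]))
short_first_total = weighted_wait([SHORT, LONG], priority_waits([SHORT, LONG]))
long_first_total = weighted_wait([LONG, SHORT], priority_waits([LONG, SHORT]))

print("waits, short first:", priority_waits([SHORT, LONG]))
print("waits, long first: ", priority_waits([LONG, SHORT]))
print("conserved sums:", fcfs_total, short_first_total, long_first_total)
```

In this model, giving one group two seconds back does not cost the other group exactly two seconds. What stays fixed is the load-weighted sum, so the exchange rate between groups depends on how much traffic each carries.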
Interesting. 
I said there's a certain conservation of something which is always the case, no matter 
what the algorithm is. No matter how you balance people off, you can't ever beat 
this. It's like the Heisenberg principle. You're going to have that much. And it
was very important in showing that -- don't ever try to beat it, because you can't; 
and it's a good test of when you have consistent proof of the way an algorithm looks, 
etc. And it's been generalized considerably since that work, which is nice. Again, 
I did the first model of timesharing, as I said, which launched the whole field of 
performance evaluation. But the answer is no. I never considered myself a 
mathematician type queuing theorist. In fact, however, there was an important 
contribution here in the following sense. You can take this for what it's worth. 
Queuing theory was begun by an engineer, a guy named Erlang, in the early part
of the century, and he developed lots and lots of stuff. He was an engineer. He was 
solving telephone engineering problems. Okay? He was my kind of guy. In the 
early 30's, in fact, in 1928, a book was written based on his work which publicized 
his work. The mathematicians took note of it then because they saw this book come 
out. In fact, William Feller -- you may know Feller, F-e-l-l-e-r. 
Huh-uh. 
He wrote the key books on